# Low Word Error Rate
Phi 4 Mm Inst Asr Singlish
MIT
A multimodal speech recognition model optimized for Singapore English (Singlish), fine-tuned from Microsoft's Phi-4 multimodal instruct model to better recognize the distinctive phonetic features of Singapore English.
Audio-to-Text
Transformers Supports Multiple Languages

mjwong
61
0
Hubert Base Librispeech Demo Colab
Apache-2.0
A speech recognition model fine-tuned from facebook/hubert-large-ls960-ft on the LibriSpeech dataset.
Speech Recognition
Transformers

vishwasgautam
101
0
Wav2vec2 Base Librispeech Demo Colab
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base on the LibriSpeech dataset, achieving a word error rate of 0.3174 on the evaluation set.
Speech Recognition
Transformers

vishwasgautam
14
0
Whisper Large V3 French Distil Dec16 Ct2
MIT
This is a distilled version of Whisper Large V3 specifically optimized for French automatic speech recognition, achieving efficient inference through ctranslate2.
Speech Recognition French
Kelno
35
1
Indian Accent English Whisper Finetuned Epoch 15
MIT
A speech recognition model for Indian-accented English, fine-tuned from OpenAI Whisper-large-v3-turbo, achieving a 7.99% word error rate on Indian-accented English datasets.
Speech Recognition
Transformers English

Tejveer12
21
2
Lite Whisper Large V3 Turbo Acc
Apache-2.0
Lite-Whisper is a lightweight version of OpenAI Whisper compressed using LiteASR technology, maintaining high accuracy while reducing model size.
Speech Recognition
Transformers

efficient-speech
7,414
7
Whisper Finetuned
MIT
A fine-tuned Whisper-large-v3-turbo model for Indian-accented English speech recognition, with a word error rate of 4.39%.
Speech Recognition
Transformers English

Tejveer12
25
2
Audiox South V1
Apache-2.0
AudioX is a multilingual automatic speech recognition model developed by Jivi AI, specifically optimized for South Indian languages, supporting Tamil, Telugu, Kannada, and Malayalam.
Speech Recognition Other
jiviai
148
1
Whisper Large V3 Turbo Shqip
MIT
An Albanian-optimized speech recognition model based on OpenAI Whisper Large v3 Turbo, supporting standard Albanian and Gheg dialect
Speech Recognition
Transformers Other

Kushtrim
143
4
Distil Large V3.5
MIT
Distil-Whisper is a knowledge-distilled version of OpenAI Whisper-Large-v3, achieving efficient speech recognition through large-scale pseudo-label training.
Speech Recognition
Transformers English

distil-whisper
4,804
25
Voice Clone Large Finetune Final
Apache-2.0
Despite its name, this is a speech recognition model fine-tuned from openai/whisper-large-v3, achieving a word error rate of 15.3572 on the evaluation set.
Speech Recognition
Transformers

neuronbit
37
2
Whisper Large V3 Turbo German Ct2
Apache-2.0
A German speech recognition model based on Whisper Large v3, optimized for German speech processing and recognition
Speech Recognition
Transformers German

jimmymeister
38
3
Whisper Large V3 Turbo Common Voice 19 0 Zh TW
MIT
A fine-tuned Traditional Chinese (Taiwan) automatic speech recognition model based on OpenAI Whisper-large-v3-turbo
Speech Recognition
Transformers Chinese

JacobLinCool
220
4
Pathumma Whisper Th Large V3
Apache-2.0
Pathumma Whisper Large V3 is a Thai automatic speech recognition model based on the OpenAI Whisper architecture, supporting Thai and English speech transcription tasks.
Speech Recognition
Transformers Supports Multiple Languages

nectec
352
4
Whisper Large V3 Turbo German
Apache-2.0
A fine-tuned model for German speech recognition based on Whisper Large v3, specifically optimized for German speech processing and recognition.
Speech Recognition
Transformers German

primeline
2,777
33
W2V2 BERT Withlm Malayalam
MIT
A Malayalam automatic speech recognition model fine-tuned from facebook/w2v-bert-2.0, trained on multiple Malayalam datasets and paired with a trigram language model built with the KenLM library.
Speech Recognition
Transformers Other

vrclc
65
3
Faster Whisper Large V3 French Distil Dec16
MIT
A distilled French version of Whisper-Large-V3, optimized for inference efficiency by reducing decoder layers while maintaining good performance
Speech Recognition
Transformers French

brandenkmurray
25
3
Whisper Large V2 Atcosim Corpus
Apache-2.0
A speech recognition model fine-tuned from openai/whisper-large-v2, achieving a word error rate of 4.6858 on a domain-specific dataset.
Speech Recognition
Transformers

daisyyedda
16
2
Wav2vec2 Phoneme
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-large-xlsr-53, focusing on phoneme recognition tasks
Speech Recognition
Transformers

Bluecast
189
3
WHISPER SMALL SWAHILI ASR CV 14
Apache-2.0
A speech recognition model fine-tuned from OpenAI's Whisper large on the Common Voice 14.0 Swahili (SW) dataset, achieving a word error rate (WER) of 25.13%.
Speech Recognition
Transformers Other

dmusingu
28
2
Whisper Small Slovenian
Apache-2.0
A speech recognition model fine-tuned from openai/whisper-small on the Slovenian ASR dataset ARTUR 1.0, supporting Slovenian speech-to-text tasks.
Speech Recognition
Transformers Other

samolego
24
3
Whisper Small Turkish V2
Apache-2.0
A speech recognition model fine-tuned on the Turkish Common Voice dataset based on OpenAI Whisper-small
Speech Recognition
Transformers Other

atakanince
61
2
Indic Whisper Nodcil
MIT
IndicWhisper is a cutting-edge speech recognition model optimized for Indian languages, excelling in various benchmark tests for Indian languages.
Speech Recognition Other
parthiv11
253
3
Indic Whisper Hi Multi Gpu
MIT
IndicWhisper is a cutting-edge speech recognition model optimized for Indian languages, excelling in various benchmarks for Indian languages.
Speech Recognition Other
parthiv11
72
4
Whisper Th Large V3 Combined
Apache-2.0
This is a Thai automatic speech recognition model fine-tuned based on OpenAI's Whisper Large V3 model, achieving a 6.59% word error rate on the Common Voice 13 Thai test set.
Speech Recognition
Transformers

biodatlab
1,354
9
Haitian Speech To Text
Apache-2.0
A Whisper-based speech recognition model optimized for Haitian Creole, featuring high-accuracy speech-to-text conversion
Speech Recognition
Transformers Other

ZeeshanGeoPk
156
1
Parakeet Tdt 1.1b
Parakeet TDT 1.1B is an automatic speech recognition (ASR) model jointly developed by NVIDIA NeMo and Suno.ai, capable of transcribing speech into lowercase English letters.
Speech Recognition English
nvidia
12.27k
90
Wav2vec2 Bert CV16 En
An automatic speech recognition (ASR) model fine-tuned on the Common Voice 16.0 English dataset based on w2v-bert-2.0
Speech Recognition
Transformers English

hf-audio
1,700
8
Whisper Large V3 French Distil Dec8
MIT
This is a distilled version of the Whisper-Large-V3 French model, optimized for inference speed and memory usage by reducing the number of decoder layers while maintaining good performance.
Speech Recognition
Transformers French

bofenghuang
32
4
Whisper Large V3 Atco2 Asr
Apache-2.0
A speech recognition model fine-tuned based on OpenAI Whisper-large-v3, specializing in Air Traffic Control (ATCO) scenarios with a word error rate of 17.04%
Speech Recognition
Transformers

jlvdoorn
1,792
5
Whisper Small Turkish Tr Best
Apache-2.0
Turkish speech recognition model fine-tuned based on OpenAI Whisper-small, with a word error rate of 26.34%
Speech Recognition
Transformers

erenfazlioglu
61
4
Asr Conformer Transformerlm Librispeech
Apache-2.0
An automatic speech recognition model based on the SpeechBrain framework, using a Conformer encoder and Transformer decoder, trained on the LibriSpeech dataset, supporting English speech recognition.
Speech Recognition English
speechbrain
984
7
Whisper Small Ko
Apache-2.0
Korean speech recognition model based on the Whisper Small architecture, fine-tuned on multi-domain Korean datasets
Speech Recognition
Transformers Korean

SungBeom
524
13
Git Base Pokemon
MIT
An image caption generation model fine-tuned from microsoft/git-base, trained on Pokemon image dataset
Image-to-Text
Transformers Other

jihwaneom
14
0
Whisper Medium Et
Whisper-medium model fine-tuned on approximately 800 hours of diverse Estonian data, suitable for general speech recognition scenarios
Speech Recognition
Transformers

TalTechNLP
115
2
Whisper Telugu Large V2
Apache-2.0
A Telugu automatic speech recognition model fine-tuned based on OpenAI Whisper-large-v2, trained on multiple public Telugu datasets
Speech Recognition Other
vasista22
156
8
Whisper Telugu Base
Apache-2.0
A Telugu automatic speech recognition (ASR) model fine-tuned based on OpenAI Whisper-base, trained on multiple public Telugu datasets
Speech Recognition Other
vasista22
279
10
Whisper Large V2 Slovenian
Apache-2.0
This model is a speech recognition model fine-tuned on the Common Voice 11.0 Slovenian dataset based on OpenAI's Whisper Large-V2 model, with a word error rate of 13.83%.
Speech Recognition
Transformers Other

DrishtiSharma
53
1
Whisper Kannada Tiny
Apache-2.0
A Kannada automatic speech recognition model fine-tuned based on openai/whisper-tiny, trained on multiple public Kannada ASR corpora
Speech Recognition Other
vasista22
119
6
Whisper Large V2 Hindi 2.5k Steps
Apache-2.0
This is a Hindi automatic speech recognition (ASR) model fine-tuned based on OpenAI Whisper Large V2, trained on the Common Voice 11.0 dataset with a word error rate (WER) of 10.05%.
Speech Recognition
Transformers Other

DrishtiSharma
52
2
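
Many entries above quote a word error rate (WER), sometimes as a fraction (0.3174) and sometimes as a percentage (7.99%). As a rough guide to what the figure measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a model's output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not taken from any listed model card, and the listed models may have applied text normalization before scoring that this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution ("seven" -> "seventy")
# and one deletion ("zero") against six reference words.
wer = word_error_rate(
    "turn left heading two seven zero",
    "turn left heading two seventy",
)
print(f"WER = {wer:.4f} ({wer * 100:.2f}%)")  # WER = 0.3333 (33.33%)
```

A lower value means fewer word-level mistakes. Multiplying the fraction by 100 gives the percentage form used in some of the descriptions.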